This folder contains materials to replicate all results, tables, and figures in Samii & West, "Repressed Productive Potential and Revolt: Insights from an Insurgency in Burundi".

To replicate all of the statistical results in the paper and appendix, follow these steps:

1. Open the "replication-archive" folder.
2. Be sure that you have subdirectories "figures" and "tables" inside the "replication-archive" folder.
3. Be sure that you have the following data files in the "replication-archive" folder:
   "bdi2007_noimpute_civ_rebels.dta"
   "bdi08_wgt130321.csv"
   "bdi08_rebattrition_long.dta"
4. Open a fresh instance of R, clearing anything from working memory.
5. Be sure all needed packages are installed in R (see "final_script_190405.R" for the set of packages that you need).
6. Set your working directory to the "replication-archive" folder (using the appropriate file path).
7. Run "final_script_190405.R" in R.
8. Run "sumstat.do" in Stata to obtain the summary statistics reported in Table 1.
9. Run "logit-estimates.do" in Stata to obtain the estimates reported in Table F.2.
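
The folder and file checks in the steps above can be sketched in R as follows. This is a minimal sketch and not part of the archive: the helper name "check_archive" is hypothetical, and it assumes your working directory is already set to the "replication-archive" folder.

```r
# Hypothetical helper (not part of the archive): return the names of any
# folders or files that the steps above expect but that are missing.
check_archive <- function(root = ".") {
  needed_dirs <- c("figures", "tables")
  needed_files <- c(
    "bdi2007_noimpute_civ_rebels.dta",
    "bdi08_wgt130321.csv",
    "bdi08_rebattrition_long.dta",
    "final_script_190405.R"
  )
  missing_dirs <- needed_dirs[!dir.exists(file.path(root, needed_dirs))]
  missing_files <- needed_files[!file.exists(file.path(root, needed_files))]
  c(missing_dirs, missing_files)
}

# Check the current working directory and report what, if anything, is missing.
missing <- check_archive(".")
if (length(missing) > 0) {
  message("Missing from the working directory: ", paste(missing, collapse = ", "))
} else {
  message("All expected folders and files found.")
}
```

If nothing is reported missing, the R script can then be run from the same session, for example with source("final_script_190405.R").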


For the paper, the script "final_script_190405.R" was executed using R v3.5.2 with packages updated as of April 2019, and "sumstat.do" and "logit-estimates.do" were run using Stata v15.
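
If you want to note how your own session compares with the one described above, something like the following may help. It is a minimal sketch and not part of the archive; the helper name "version_note" is hypothetical, and the only version it checks is the R version stated above.

```r
# Hypothetical helper (not part of the archive): compare the running R version
# with the version reported for the paper (3.5.2).
version_note <- function(current = getRversion(), paper = "3.5.2") {
  if (current == paper) {
    sprintf("R %s matches the version used for the paper.", paper)
  } else {
    sprintf("Running R %s; the paper used R %s.", as.character(current), paper)
  }
}

message(version_note())

# Record the loaded packages and their versions alongside your output.
print(sessionInfo())
```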

A note on the key data files:

1. "bdi2007_noimpute_civ_rebels.dta"
2. "bdi08_wgt130321.csv"
3. "bdi08_rebattrition_long.dta"

The first is the main dataset, with all outcome variables, regressors, and auxiliary variables from the 2007 survey of civilians and former rebels in Burundi.  It also contains additional variables from the survey that were not used in the analysis.  If users are interested in reference documentation for these other variables, contact Cyrus Samii (see below).


The second is a file that contains the survey weights that account for unequal sampling probabilities.  

The third is roster data that characterizes individual death and survival during the war.  This was constructed from the survey data.

The other data files ("bdi_ed_tusti.csv" and "bdi_s.csv") are files that are generated by "final_script_190405.R".  These are used in "sumstat.do" (a Stata do file) which computes summary statistics for Table 1 in the paper. 

The file "log-2019-04-19.txt" is a log of a session that shows how the code should run in R.  It includes everything except for the summary statistics, which are computed by running "sumstat.do" in Stata.


To produce Tables F.3-F.8, re-run "final_script_190405.R" after adjusting lines 86 and 87 and then lines 353 and 354 to toggle the age restriction from on to off.  This is explained in the R script.

Here is a guide to the tables and figures as they appear in the paper and appendix, in relation to the output from "final_script_190405.R", "sumstat.do", and "logit-estimates.do":

Main text:
Table 1: constructed from results generated by "sumstat.do".
Table 2: tables/partbasic.tex
Table 3: tables/tutsi-placebo-check.tex
Table 4: tables/jobranks.tex (top panel)
Table 5: tables/jobranks.tex (bottom panel)
Table 6: tables/fitedcor.tex
Table 7: tables/part.tex
Figure 1: figures/agepartint.pdf

Appendix:
Table D.1: tables/attrition.tex
Table D.2: tables/surv-model.tex
Table E.1: tables/dadalive.tex
Table E.2: tables/revenge.tex
Table E.3: tables/selrecruit.tex
Table E.4: tables/partbasic-forappendix-bros-wealth.tex
Table E.5: tables/partbasic-forappendix-networks.tex
Table E.6: tables/partbasic-forappendix-politicalsocialization.tex
Table F.1: tables/partbasic-sens.tex
Table F.2: tables/logit-out.tex
Tables F.3-F.8: Produced by "final_script_190405.R" after adjusting lines 86 and 87 and then lines 353 and 354 to toggle the age restriction off


For any questions, contact Cyrus Samii (cds2083@nyu.edu). 